Hard Regularization to Prevent Collapse in Online Deep Clustering without Data Augmentation
Online deep clustering refers to the joint use of a feature extraction
network and a clustering model to assign cluster labels to each new data point
or batch as it is processed. While faster and more versatile than offline
methods, online clustering can easily reach the collapsed solution where the
encoder maps all inputs to the same point and all are put into a single
cluster. Successful existing models have employed various techniques to avoid
this problem, most of which either require data augmentation or aim to make the
average soft assignment across the dataset the same for each cluster. We
propose a method that does not require data augmentation, and that, differently
from existing methods, regularizes the hard assignments. Using a Bayesian
framework, we derive an intuitive optimization objective that can be
straightforwardly included in the training of the encoder network. On four
image datasets, we show that our method avoids collapse more robustly than
other methods and leads to more accurate clustering. We also
conduct further experiments and analyses justifying our choice to regularize
the hard cluster assignments.
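As a rough illustration of why regularizing hard assignments can counter collapse, the sketch below scores a batch of hard cluster labels by how far their empirical distribution is from uniform (the function name and the entropy-gap form are illustrative assumptions; the paper derives its actual objective from a Bayesian framework):

```python
import numpy as np

def hard_assignment_penalty(hard_labels, n_clusters):
    """Penalty that grows as hard assignments concentrate on few
    clusters (illustrative; not the paper's exact objective)."""
    counts = np.bincount(hard_labels, minlength=n_clusters).astype(float)
    probs = counts / counts.sum()
    nonzero = probs[probs > 0]
    # Entropy is log(K) when assignments are uniform, 0 when collapsed,
    # so the gap below is 0 for balanced batches and log(K) at collapse.
    entropy = -np.sum(nonzero * np.log(nonzero))
    return np.log(n_clusters) - entropy

balanced = hard_assignment_penalty(np.array([0, 1, 2, 3] * 8), 4)
collapsed = hard_assignment_penalty(np.zeros(32, dtype=int), 4)
```

Adding such a term to the encoder's loss directly discourages the single-cluster solution without touching the soft assignment distribution.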
Efficient Deep Clustering of Human Activities and How to Improve Evaluation
There has been much recent research on human activity recognition
(HAR), due to the proliferation of wearable sensors in watches and phones, and
the advances of deep learning methods, which avoid the need to manually extract
features from raw sensor signals. A significant disadvantage of deep learning
applied to HAR is the need for manually labelled training data, which is
especially difficult to obtain for HAR datasets. Progress is starting to be
made in the unsupervised setting, in the form of deep HAR clustering models,
which can assign labels to data without having been given any labels to train
on. However, there are problems with evaluating deep HAR clustering models,
which make assessing the field and devising new methods difficult. In this paper, we
highlight several distinct problems with how deep HAR clustering models are
evaluated, describing these problems in detail and conducting careful
experiments to explicate the effect that they can have on results. We then
discuss solutions to these problems, and suggest standard evaluation settings
for future deep HAR clustering models. Additionally, we present a new deep
clustering model for HAR. When tested under our proposed settings, our model
performs better than (or on par with) existing models, while also being more
efficient and better able to scale to more complex datasets by avoiding the
need for an autoencoder.
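Deep clustering evaluations commonly report clustering accuracy after optimally matching predicted clusters to ground-truth labels with the Hungarian algorithm; a minimal sketch of that standard metric (not the paper's proposed evaluation settings) is:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def clustering_accuracy(y_true, y_pred):
    """Clustering accuracy via an optimal one-to-one matching between
    predicted clusters and ground-truth classes (Hungarian algorithm)."""
    n = max(y_true.max(), y_pred.max()) + 1
    cost = np.zeros((n, n), dtype=int)
    for t, p in zip(y_true, y_pred):
        cost[p, t] += 1  # co-occurrence counts: cluster p vs class t
    # Negate to turn the maximum-weight matching into a min-cost one.
    row, col = linear_sum_assignment(-cost)
    return cost[row, col].sum() / len(y_true)

# Cluster IDs are arbitrary, so a relabelled-but-perfect clustering
# should score 1.0.
acc = clustering_accuracy(np.array([0, 0, 1, 1]), np.array([1, 1, 0, 0]))
```

Because cluster IDs carry no inherent meaning, metrics like this must be computed after matching, which is one of the places where evaluation practice can silently diverge between papers.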
Cross-Dictionary Linking at Sense Level with a Double-Layer Classifier
We present a system for linking dictionaries at the sense level, which is part of a wider programme aiming to extend current lexical resources and to create new ones by automatic means. One of the main challenges of the sense linking task is the existence of non-one-to-one mappings among senses. Our system handles this issue by addressing the task as a binary classification problem using standard machine learning methods, where each sense pair is classified independently of the others. In addition, it implements a second, statistically based classification layer to also model the dependence among sense pairs, namely, the fact that a sense in one dictionary that is already linked to a sense in the other dictionary has a lower probability of being linked to a further sense. The resulting double-layer classifier achieves global precision and recall scores of 0.91 and 0.80, respectively.
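A minimal sketch of the second-layer idea, assuming a simple multiplicative down-weighting once a sense is already linked (the function, penalty value, and threshold are illustrative assumptions, not the paper's statistical model):

```python
def adjust_link_scores(pair_scores, penalty=0.5, threshold=0.5):
    """Illustrative second-layer pass: once a sense participates in an
    accepted link, further candidate links involving it are down-weighted.
    The multiplicative penalty and the acceptance threshold are assumed
    values, not the paper's learned statistics."""
    linked = set()
    adjusted = {}
    # Process candidate pairs from most to least confident first-layer score.
    for (s1, s2), score in sorted(pair_scores.items(), key=lambda kv: -kv[1]):
        if s1 in linked or s2 in linked:
            score *= penalty
        adjusted[(s1, s2)] = score
        if score >= threshold:
            linked.update([s1, s2])
    return adjusted

# Hypothetical first-layer probabilities for three sense pairs.
scores = {("a1", "b1"): 0.9, ("a1", "b2"): 0.7, ("a2", "b2"): 0.6}
out = adjust_link_scores(scores)
```

The effect is the one described above: after ("a1", "b1") is accepted, the competing pair ("a1", "b2") is penalised, while ("a2", "b2") keeps its original score.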
Correcting Flaws in Common Disentanglement Metrics
Recent years have seen growing interest in learning disentangled
representations, in which distinct features, such as size or shape, are
represented by distinct neurons. Quantifying the extent to which a given
representation is disentangled is not straightforward; multiple metrics have
been proposed. In this paper, we identify two failings of existing metrics,
which mean they can assign a high score to a model which is still entangled,
and we propose two new metrics, which redress these problems. We then consider
the task of compositional generalization. Unlike prior works, we treat this as
a classification problem, which allows us to use it to measure the
disentanglement ability of the encoder, without depending on the decoder. We
show that performance on this task is (a) generally quite poor, (b) correlated
with most disentanglement metrics, and (c) most strongly correlated with our
newly proposed metrics.
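A toy sketch of compositional generalization treated as a classification problem, assuming a synthetic, perfectly disentangled encoder and a nearest-centroid probe (all names and data here are illustrative, not the paper's setup or metrics):

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(shape, size):
    # Toy encoder, perfectly disentangled by construction:
    # one neuron per generative factor, plus a little noise.
    return np.array([shape, size], dtype=float) + rng.normal(0, 0.05, 2)

train, test = [], []
for shape in (0, 1):
    for size in (0, 1):
        for _ in range(50):
            sample = (encode(shape, size), shape)
            # Hold out the novel factor combination (shape=1, size=1).
            (test if (shape, size) == (1, 1) else train).append(sample)

Xtr = np.array([z for z, _ in train]); ytr = np.array([y for _, y in train])
Xte = np.array([z for z, _ in test]); yte = np.array([y for _, y in test])

# Nearest-centroid classifier on the representation: predict `shape`
# for the never-seen combination, using only the encoder's output.
centroids = np.stack([Xtr[ytr == c].mean(axis=0) for c in (0, 1)])
pred = np.argmin(np.linalg.norm(Xte[:, None] - centroids, axis=2), axis=1)
novel_combo_acc = (pred == yte).mean()
```

Because the probe only reads the encoder's output, the score depends on the encoder alone, which is the decoder-free property argued for above; a disentangled representation makes the held-out combination easy to classify.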
Berkeley's criterion of truth
This item was digitized by the Internet Archive.
Knowledge Graph Extraction from Videos
Nearly all existing techniques for automated video annotation (or captioning)
describe videos using natural language sentences. However, this has several
shortcomings: (i) it is very hard to make further use of the generated natural
language annotations in automated data processing, (ii) generating natural
language annotations requires solving the hard subtask of producing
semantically precise and syntactically correct natural language sentences,
which is actually unrelated to the task of video annotation, (iii) it is
difficult to quantitatively measure performance, as standard metrics (e.g.,
accuracy and F1-score) are inapplicable, and (iv) annotations are
language-specific. In this paper, we propose the new task of knowledge graph
extraction from videos, i.e., producing a description in the form of a
knowledge graph of the contents of a given video. Since no datasets exist for
this task, we also include a method to automatically generate them, starting
from datasets where videos are annotated with natural language. We then
describe an initial deep-learning model for knowledge graph extraction from
videos, and report results on MSVD* and MSR-VTT*, two datasets obtained from
MSVD and MSR-VTT using our method.
Comment: 10 pages, 4 figures
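One payoff of graph-structured output is that standard set-based metrics apply directly; a minimal sketch of triple-level precision, recall, and F1 (an illustrative scorer with made-up triples, not the paper's evaluation code):

```python
def triple_f1(predicted, gold):
    """Precision/recall/F1 over knowledge-graph triples: exactly the
    kind of set-based scoring that natural-language captions do not
    support but graph output does."""
    pred, gold = set(predicted), set(gold)
    tp = len(pred & gold)  # triples recovered exactly
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical gold and predicted graphs for a single video.
gold = [("man", "rides", "horse"), ("horse", "on", "beach")]
pred = [("man", "rides", "horse"), ("man", "on", "beach")]
p, r, f = triple_f1(pred, gold)
```

In practice one might also credit partial matches (e.g. correct subject and predicate with a wrong object), but even this exact-match form already gives the quantitative accuracy and F1 scores that the paper notes are inapplicable to free-text captions.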